Adaptive Audio-Visual Speech Recognition with Distorted Audio and Video Data

Authors

  • Martin Heckmann
  • Frédéric Berthommier
  • Christophe Savariaux
  • Kristian Kroschel
Abstract

Martin Heckmann, Frédéric Berthommier, Christophe Savariaux, Kristian Kroschel (3)
(1) Honda Research Institute Europe, 63073 Offenbach, Germany, Email: [email protected]
(2) Institut de la Communication Parlée (ICP), 38031 Grenoble, France, Email: {bertho, savario}@icp.inpg.fr
(3) Institut für Nachrichtentechnik, Universität Karlsruhe, 76128 Karlsruhe, Germany, Email: [email protected]

Similar resources

Improved Speech Recognition using Adaptive Audio-visual Fusion via a Stochastic Secondary Classifier

The adaptive fusion of video and audio is one of the fundamental pursuits of audio-visual speech recognition (AVSR). In this paper the use of a high-dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Results are presented that lie above or equal to the boundary of catastrophic fusion acros...

Adaptive Audio-visual Speech Recognition in the Presence of Audio and Video Distortions

Audio-visual speech recognition leads to significant improvements compared to pure audio recognition, especially when the audio signal is corrupted by noise. In this article we investigate the consequences of additional degradations in the video signal on the audio-visual recognition process. We degrade the images with noise, JPEG compression, and errors in the localization of the mouth regio...

Audiovisual speech recognition with missing or unreliable data

In order to robustly recognize distorted speech, the use of visual information has proven valuable in many recent investigations. However, visual features may not always be available, and they can be unreliable in unfavorable recording conditions. The same is true for distorted audio information, where noise and interference can corrupt some of the acoustic speech features used for recognition...

Open-Domain Audio-Visual Speech Recognition: A Deep Learning Approach

Automatic speech recognition (ASR) on video data naturally has access to two modalities: audio and video. In previous work, audio-visual ASR, which leverages visual features to help ASR, has been explored on restricted domains of videos. This paper aims to extend this idea to open-domain videos, for example videos uploaded to YouTube. We achieve this by adopting a unified deep learning approach...

Characteristics of the Use of Coupled Hidden Markov Models for Audio-Visual Polish Speech Recognition

This paper focuses on combining audio-visual signals for Polish speech recognition in conditions of a highly disturbed audio speech signal. Recognition of audio-visual speech was based on coupled hidden Markov models (CHMM). The described methods were developed for a single isolated command; nevertheless, their effectiveness indicated that they would also work similarly in continuous audio-visual sp...

Publication date: 2005